Between recognition and synthesis - 300 bits/second speech coding

Authors

  • Mohamed Ismail
  • Keith Ponting
Abstract

This paper describes a speech coding system designed to operate at 300 bits/sec and below. A continuous speech recogniser is used to transcribe incoming speech as a sequence of sub-word units termed acoustic segments. Prosodic information is combined with segment identity to form a serial data stream suitable for transmission. A rule-based system maps segment identity and prosodic information to parameters suitable for driving a parallel formant speech synthesiser. Acoustic segment Hidden Markov Models (HMMs) are shown to perform as well as conventional phone HMMs during recognition. A segment error rate of 3.8% was achieved in a speaker-dependent, task-dependent configuration. An average data rate of 262 bits/sec was obtained. Speech from the synthesiser was better than that obtainable from a purely textual representation, though not as good as 2400 bits/sec Linear Predictive Coding (LPC) vocoded speech.
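The abstract does not give the layout of the serial data stream, so the following is only an illustrative sketch of how a segment identity and coarse prosodic codes could be packed into fixed-width frames and how an average data rate follows from that. The field widths (`SEGMENT_BITS`, `DURATION_BITS`, `PITCH_BITS`), the `CodedSegment` type, and the function names are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical field widths; the abstract does not specify the real ones.
SEGMENT_BITS = 6    # room for up to 64 acoustic segment identities
DURATION_BITS = 3   # coarse duration code
PITCH_BITS = 3      # coarse pitch code
FRAME_BITS = SEGMENT_BITS + DURATION_BITS + PITCH_BITS


@dataclass(frozen=True)
class CodedSegment:
    """One recognised segment plus its quantised prosody (assumed fields)."""
    segment_id: int
    duration_code: int
    pitch_code: int


def pack(segments: List[CodedSegment]) -> str:
    """Serialise segments into a string of '0'/'1' characters."""
    out = []
    for seg in segments:
        fields = (
            (seg.segment_id, SEGMENT_BITS),
            (seg.duration_code, DURATION_BITS),
            (seg.pitch_code, PITCH_BITS),
        )
        for value, width in fields:
            if not 0 <= value < (1 << width):
                raise ValueError(f"value {value} does not fit in {width} bits")
            out.append(format(value, f"0{width}b"))
    return "".join(out)


def unpack(bits: str) -> List[CodedSegment]:
    """Recover the segment list from a bit string produced by pack()."""
    if len(bits) % FRAME_BITS != 0:
        raise ValueError("bit string is not a whole number of frames")
    segments = []
    for start in range(0, len(bits), FRAME_BITS):
        frame = bits[start:start + FRAME_BITS]
        dur_start = SEGMENT_BITS
        pitch_start = SEGMENT_BITS + DURATION_BITS
        segments.append(
            CodedSegment(
                segment_id=int(frame[:dur_start], 2),
                duration_code=int(frame[dur_start:pitch_start], 2),
                pitch_code=int(frame[pitch_start:], 2),
            )
        )
    return segments


def average_bit_rate(bits: str, seconds: float) -> float:
    """Average data rate in bits/sec for a stream covering `seconds` of speech."""
    return len(bits) / seconds
```

Under these assumed widths each segment costs 12 bits, so a rate near the reported 262 bits/sec would correspond to roughly 22 segments per second. That figure depends entirely on the hypothetical frame layout and says nothing about the actual coder.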

Similar resources

Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture

Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (on frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We make an assumption in our work that the speech signal contains supra-segmental cues. Hence, we present encoding of the pitch on the syllable level, used in the framework of a recognition/synthesis sp...

Dynamic Unit Selection for Very Low Bit Rate Coding at 500 bits/sec

This paper presents a new unit selection process for Very Low Bit Rate speech encoding around 500 bits/sec. The encoding is based on speech recognition and speech synthesis technologies. The aim of this approach is to use at best the speech corpus of the speaker. The proposed solution uses HMM modelling for the recognition of elementary speech units. The HMM are first trained in an unsupervised...

Towards a unified model for low bit-rate speech coding using a recognition-synthesis approach

This paper proposes a recognition-synthesis approach to speech coding which uses an underlying formant trajectory model for both recognition and synthesis. It is argued that this “unified” approach to coding has the potential to achieve low data rates whilst preserving speech quality and important paralinguistic information. A simple coding scheme is described which establishes the principles o...

Stress and accent transmission in HMM-based syllable-context very low bit rate speech coding

In this paper, we propose a solution to reconstruct stress and accent contextual factors at the receiver of a very low bitrate speech codec built on recognition/synthesis architecture. In speech synthesis, accent and stress symbols are predicted from the text, which is not available at the receiver side of the speech codec. Therefore, speech signal-based symbols, generated as syllable-level log...

Speech Recognition of the letter 'zha' in Tamil Language using HMM

Speech signals of the letter ‘zha’ (ழ) in Tamil language of 3 males and 3 females were coded using an improved version of Linear Predictive Coding (LPC). The sampling frequency was at 16 kHz and the bit rate was at 15450 bits per second, where the original bit rate was at 128000 bits per second with the help of wave surfer audio tool. The output LPC cepstrum is implemented in first order three ...

Publication date: 1997